Abstract — The problem of cooperative fusion in the presence 
of Byzantine sensors is considered. An information theoretic 
formulation is used to characterize the Shannon capacity of 
sensor fusion. It is shown that when less than half of the sensors 
are Byzantine, the effect of Byzantine attack can be entirely 
mitigated, and the fusion capacity is identical to that when all 
sensors are honest. But when at least half of the sensors are 
Byzantine, they can completely defeat the sensor fusion so that 
no information can be transmitted reliably. A capacity achieving 
transmit-then-verify strategy is proposed for the case that less 
than half of the sensors are Byzantine, and its error probability 
and coding rate are analyzed by modeling the transmission 
protocol as a Markov decision process. 

Index Terms — Sensor Fusion, Byzantine Attack, Shannon Ca- 
pacity, Network Security. 

I. Introduction 

WIRELESS sensor networks are not physically secure; 
they are vulnerable to various attacks. For example, 
sensors may be captured and analyzed such that the attacker 
gains inside information about the communication scheme and 
networking protocols. The attacker can then reprogram the 
compromised sensors and use them to launch the so-called 
Byzantine attack. This paper presents an information theoretic 
approach to sensor fusion in the presence of Byzantine sensors. 

A. Cooperative Sensor Fusion 

We consider the problem of cooperative sensor fusion as 
illustrated in Fig. 1, where the fusion center extracts information 
from a sensor field. By cooperative fusion we mean that 
sensors first reach a consensus among themselves about the 
fusion message. They then deliver the agreed message to the 
fusion center collaboratively. We will not be concerned with 
how sensors reach consensus in this paper; see e.g., [1]. We 
focus instead on achieving the maximum rate of sensor fusion. 

Fig. 1. Cooperative sensor fusion in the presence of Byzantine sensors. 

The sensor fusion problem is trivial if the consensus is 
perfect, i.e., all the sensors agree on the same fusion message. 
If the fusion center can only communicate with one sensor 
at a time, and there is no limit on how many times a sensor 
can transmit (i.e., no energy constraints), there is no difference 
between having a single sensor deliver the message 
and having any number of sensors transmit the message 
collaboratively. The capacity of such an ideal fusion is given 
by the classical Shannon theory 

$$C = \max_{p(x)} I(X;Y) \tag{1}$$




where $X$ is the symbol transmitted by a sensor, $Y$ the 
received symbol, and $p(x)$ the distribution used to generate 
the codebook. Also for this case, even if there is a feedback 
channel from the fusion center to sensors, the capacity does 
not increase [2]. 
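
As a numerical illustration of (1), the following sketch evaluates $I(X;Y)$ for a binary-input DMC and maximizes it over $p(x)$ by a simple grid search. The binary symmetric channel with crossover probability 0.1, the function names, and the grid resolution are assumptions made for this example only; they are not part of the model considered in this paper.

```python
import math

def mutual_information(px, q):
    """I(X;Y) in bits for input distribution px and channel matrix q[x][y]."""
    ny = len(q[0])
    py = [sum(px[x] * q[x][y] for x in range(len(px))) for y in range(ny)]
    info = 0.0
    for x, p in enumerate(px):
        for y in range(ny):
            joint = p * q[x][y]
            if joint > 0:
                info += joint * math.log2(q[x][y] / py[y])
    return info

def binary_input_capacity(q, steps=1000):
    """Approximate C = max over p(x) of I(X;Y) for a binary-input DMC (grid search)."""
    return max(mutual_information([a / steps, 1 - a / steps], q)
               for a in range(steps + 1))

# Illustrative channel: binary symmetric channel with crossover probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
capacity = binary_input_capacity(bsc)
```

For this symmetric channel the maximum is attained at the uniform input, so the grid search should agree with the closed form $1 - H(0.1)$.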

Cooperative fusion becomes important if consensus cannot 
be reached, i.e., there is a probability $\beta > 0$ that a particular 
sensor is misinformed about what message to transmit. Thus 
there is a positive probability that a particular sensor communi- 
cating with the fusion center is delivering the wrong message. 
It is no longer obvious what the capacity of sensor fusion 
is. In [3], a number of sensor fusion models are considered, 
and the fusion capacity is obtained for several cases. Most 
relevant to this paper is the fusion model in which there is a 
feedback channel from the fusion center to individual sensors, 
and the fusion center polls specific sensors for transmissions. 
It is shown that, when optimized over all polling strategies, 
the fusion capacity for any $\beta < 1$ is also given by $C$ in (1). The 
strategy given in [3] can be characterized as "identify-then-transmit": 
it first uses an asymptotically negligible number 
of transmissions to identify a sensor that is correctly informed, 
then lets that sensor transmit the entire codeword. 
B. Byzantine Attack and Related Work 

The problem considered in this paper arises when a fraction $\beta$ 
of the sensors are Byzantine. The goal of these Byzantine 
sensors is to disrupt the sensor fusion collaboratively. 

We assume that Byzantine sensors have full knowledge 
of the system and impose no restriction on what they can 
transmit. In particular, Byzantine sensors know the transmis- 
sion strategy including the codebook and the polling strategy 
of the fusion center. They also know, of course, the correct 
fusion message. Unlike the misinformed sensors in [3], which 
transmit randomly selected messages, Byzantine sensors can 
act maliciously at some times and behave as honest sensors at 
others in order to evade detection by the fusion center. 
Furthermore, they can coordinate among themselves (unknown 
to both the honest sensors and the fusion center) to launch the 
so-called Byzantine attack. As a result, the capacity achieving 
coding and transmission strategies developed in [3] are no 
longer applicable. 

The notion of Byzantine attack has its roots in the Byzantine 
generals problem [4], [5], in which a clique of traitorous 
generals conspire to prevent loyal generals from reaching consensus. 
It was shown in [4] that consensus in the presence of Byzantine 
attack is possible if and only if less than 1/3 of the generals are 
traitorous. Relaxing the strict definition of consensus of [4], 
Pfitzmann and Waidner use an information theoretic approach 
to show that the Byzantine generals problem can be solved for 
an arbitrarily large fraction of Byzantine nodes [6]. These 
and other Byzantine consensus results [1] are relevant to the 
current paper only in that they deal with the consensus process 
prior to sensor fusion. 

Countering Byzantine attacks in communication networks 
has also been studied by many authors. See the 
earlier work of Perlman [7] and the more recent reviews 
[8], [9]. An information theoretic network coding approach 
to Byzantine attack is presented in [10]. Karlof and Wagner 
[11] consider routing security in wireless sensor networks. 
They introduce different kinds of attacks and analyze security 
risks of all major existing sensor network routing protocols. 
Countermeasures and design considerations for secure rout- 
ing in sensor networks are also discussed. It is shown that 
cryptography alone is not enough; careful protocol design is 
necessary. 

There have been limited attempts to deal with Byzantine 
attacks in sensor fusion. The problem of optimal Byzantine 
attack of sensor fusion for distributed detection is considered 
in [12] where the authors show that exponentially decaying 
detection error probabilities can still be maintained if and 
only if the fraction of Byzantine sensors is less than half. 
A witness-based approach to sensor fusion is proposed by Du 
et al. [13], where the fusion center and a set of witnesses 
jointly authenticate the fusion data by the use of a Message 
Authentication Code. The authors of [13] are concerned with 
the trustworthiness of the fusion center. In contrast, we address 
the problem of sensor fusion with malicious sensors attacking 
the fusion center from within. 



C. Main Result and Organization 

The main result of this paper is to show that, if polling by 
the fusion center is allowed, and the polling is perfect, the 
capacity of sensor fusion in the presence of Byzantine attack 
is again $C$ in (1) when $\beta < 1/2$ and is $0$ when $\beta \ge 1/2$. 

The converse of the result holds trivially for $\beta < 1/2$ because 
the capacity of the sensor fusion in the absence of Byzantine 
sensors is $C$. For $\beta \ge 1/2$, we show that it is possible for the 
Byzantine sensors to completely defeat the fusion center and 
honest sensors by setting things up so that exactly half the 
sensors act honestly with the true message and the other half 
also act honestly but with a false message. It is thus impossible 
for the fusion center to distinguish the set transmitting the true 
message from the set transmitting the false one, so it cannot 
decode the true message with probability more than $1/2$. 

To show the achievability for $\beta < 1/2$, we propose a transmission 
and coding strategy different from that for misinformed 
sensors [3]. There the capacity achieving strategy can be 
called "identify-then-transmit": the fusion center first 
identifies an honest sensor, then receives the entire message 
from that sensor. Here we must deal with the situation in 
which a Byzantine sensor may pretend to be an honest sensor. 
The key idea is one of "transmit-then-verify". Specifically, we 
first commit a sensor (Byzantine or honest) to transmit part 
of a codeword and then verify if the sensor is trustworthy. 
After a sensor has transmitted, the fusion center verifies the 
transmission using a random binning procedure. Under this 
procedure, a Byzantine sensor either has to act honestly or 
reveal its identity with high probability. We then have to show 
that the verification overhead diminishes as the length 
of the codeword increases. 

This paper is organized as follows. In Section II we present 
models for sensors, communication channels, and network 
setup. The main result and a sketch of its proof are 
presented in Section III. We conclude in Section IV. 

II. Model and Definitions 
A. Fusion Network and Communication Channels 

A sensor is Byzantine if it can behave arbitrarily. A sensor is 
honest if it behaves only according to the specified protocol. 
Let $\beta$ be the probability that a randomly selected sensor is 
Byzantine. With probability $1 - \beta$, a randomly chosen sensor is 
honest. We assume that the sensor network is large in the sense 
that it contains an infinite number of sensors. This assumption 
ensures that the probability of all nodes being Byzantine is 
zero. 

Sensors can communicate with the fusion center directly, 
and the transmissions are time slotted. We assume that the 
uplink channel from each sensor to the fusion center is a 
Discrete Memoryless Channel (DMC) $\{\mathcal{X}, \mathcal{Y}, q(y|x)\}$, where 
$\mathcal{X}$ is the input alphabet, $\mathcal{Y}$ the output alphabet, and $q(y|x)$ 
the transition probability of the channel. The assumption of 
identical channels is restrictive, and synchronization is difficult, 
when the network is large and the fusion center stationary. The 
assumed model is reasonable, however, if the fusion center is 
a mobile access point that can travel around the network, and 
a sensor only transmits to the fusion center when it is activated 
by and synchronized to the fusion center. 

We assume that there is a polling channel from the fusion 
center to each sensor. Since the fusion center is not power 
limited, we assume the polling channel is error free with 
infinite capacity. 

B. Transmission Protocol 

Before sensor fusion starts, we assume that the sensor 
network, without error, has agreed upon a fusion message 
$W \in \{1, \dots, M\}$ that is uniformly distributed. The code is 
in general variable length and dynamically generated, so there 
is no single fixed codebook. However, we assume that the 
sensors may have any number of fixed codebooks to use as 
pieces of the code. 

The fusion center polls one node to transmit one symbol 
at each time slot. At time $t$, the fusion center polls node $K_t$ 
to transmit a symbol $X_t$. The symbol received by the fusion 
center is then $Y_t$. The fusion center may choose $K_t$ based on 
previously received symbols $Y^{t-1}$ and polling history $K^{t-1}$. 

Since the polling channel has infinite capacity, $K_t$ may choose 
$X_t$ based on all symbols previously received by the fusion 
center $Y^{t-1}$, the polling history $K^{t-1}$, and anything else the 
fusion center chooses to send to it. It may also base $X_t$ on the 
message $W$ and on all previous transmissions that it has made 
itself, but not on those made by other sensors. 

If a sensor is Byzantine, it may also base its choice of Xt 
on all transmitted symbols, including those sent by honest 
sensors, and any additional information the fusion center sends 
to any sensor. We also assume that the Byzantine sensors know 
the algorithm the fusion center and honest sensors are using, 
and that they may communicate securely among themselves 
with zero error. 

After the fusion center receives $Y_t$, it decides whether to 
continue polling based on $Y^t$ and $K^t$. If it decides to continue, 
then it moves on to the next time slot t + 1 and starts the 
polling step again. Otherwise, it decodes based on collected 
observations. 

C. Achievable Rates and Capacity 

Let $N$ be the random variable representing the total number 
of symbols sent in a coding session. Once the fusion center 
decides it is done polling, it decodes the global message based 
on $Y^N$ and $K^N$. The decoded message is denoted by $\hat{W} \in 
\{1, \dots, M\}$. A decoding error occurs if $\hat{W} \neq W$. 

The rate of a code is defined as 

$$R = \frac{\log M}{E(N)},$$

where $M$ is the number of messages and $E(N)$ is the expected 
number of symbols transmitted during a coding session. The 
probability of error is defined as $P_e = \Pr(\hat{W} \neq W)$, where 
$W$ is the message, uniformly selected from $\{1, \dots, M\}$, and 
$\hat{W}$ is the decoded message. $P_e$ will in general depend on the 
actions of the Byzantine sensors. A rate $R$ is called achievable 
if for any given $\epsilon > 0$ and any choice of actions by the 
Byzantine sensors, there exists a code with rate larger than 
$R - \epsilon$ and probability of error less than $\epsilon$. The capacity of this 
system is defined as the maximum of all achievable rates. 

III. Fusion Capacity 

The main result of this paper is given by the following 
theorem, which characterizes the capacity for the fusion network 
described in Section II. 

Theorem: The capacity of this system is 

$$C^{\mathrm{byz}} = \begin{cases} C, & \text{if } \beta < 1/2 \\ 0, & \text{if } \beta \ge 1/2 \end{cases}$$

where $C$ is defined in (1). 

A sketch of the proof of this theorem follows. In Subsection 
III-A we prove the converse. In Subsection III-B we 
describe the coding strategy used to prove achievability. In 
Subsection III-C we define some error events and discuss the 
error probability. Finally, in Subsection III-D we discuss the 
rate of this coding scheme. 

A. Converse 

Suppose that $\beta = 0$ and that all the sensors may communicate 
with each other with zero error. Certainly these 
assumptions cannot decrease the capacity for any $\beta$. Since 
the sensors can communicate with each other, and since they 
are allowed to know all symbols previously received by the 
fusion center, we can think of the entire sensor network as a 
single encoder for the DMC with perfect feedback. Thus 
under these assumptions this system reduces to a point-to-point 
DMC with perfect feedback. In that system, the feedback 
does not increase capacity [2], so the capacity is $C$. Thus the 
sensor network with Byzantine sensors cannot 
have capacity greater than this, so $C^{\mathrm{byz}} \le C$ for all $\beta$. 

Next we show that if $\beta \ge 1/2$, then $C^{\mathrm{byz}} = 0$. To do this, we 
show that for any algorithm used by the fusion center 
and honest sensors, the Byzantine sensors can make 
it impossible for the probability of error to be made arbitrarily 
small. The scheme performed by the Byzantine sensors to 
accomplish this is as follows. They divide themselves into 
two groups, one containing a fraction $1/2$ of all the sensors, and 
one containing a fraction $\beta - 1/2$ of all the sensors. The sensors 
in the latter group act exactly like honest sensors. Since there is 
no way for the honest sensors to know anything that the 
Byzantine sensors do not, it is impossible to distinguish an 
honest sensor from a Byzantine sensor acting honestly. The 
sensors in the former group also act exactly like honest sensors, 
but with a message different from the true one. Thus exactly 
half of the sensors (the honest sensors plus the Byzantine 
sensors that act honestly) act honestly with the true message. 
The other half of the sensors (the rest of the Byzantine sensors) 
also act honestly but with an incorrect message. Since the two 
groups are the same size, no matter what the fusion center 
does, it cannot determine which half is reporting 
the true message and which half is reporting the false one, so 
it cannot decode the true message with probability 
greater than $1/2$. Therefore the converse of the theorem holds. 
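
The symmetry behind this argument can be checked numerically. The sketch below assumes a noiseless channel, an infinite network in which each polled sensor independently belongs to the true-message half or the false-message half with probability 1/2, and a fusion center that guesses by majority vote. All of these are simplifications introduced for illustration, and the function name is hypothetical.

```python
import random

def majority_decoder_success(polls, trials, rng):
    """Estimate how often a majority-vote fusion center recovers the true
    message when half of the sensors consistently report a false one."""
    wins = 0
    for _ in range(trials):
        true_msg, false_msg = rng.sample(range(1000), 2)  # two distinct messages
        reports = [true_msg if rng.random() < 0.5 else false_msg
                   for _ in range(polls)]
        true_votes = reports.count(true_msg)
        wins += 2 * true_votes > polls   # strict majority for the true message
    return wins / trials

# An odd number of polls avoids ties.
rate = majority_decoder_success(polls=101, trials=20000, rng=random.Random(0))
```

Because the two groups are statistically interchangeable from the fusion center's point of view, the estimated success rate should stay close to 1/2 however many sensors are polled.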
B. Coding Strategy 

To prove the direct part of the theorem, we first describe the 
coding strategy that will achieve this rate. The coding scheme 
can be described as a "transmit-then-verify" procedure. In 
other words, first we ask a sensor to send part of the message 
to the fusion center. After that, the fusion center polls other 
sensors to verify whether the received information is correct. 
Thus, if a Byzantine sensor is selected to transmit the message, 
it can send erroneous information, but then with high proba- 
bility it will be discovered to be erroneous in the "verify" step. 
The Byzantine sensor can send the true information, but then it 
will be verified, so the fusion center now has that information, 
and knows it to be correct. As long as the fusion center always 
verifies any information it receives, the Byzantine sensors can 
never get any false information through. The best they can do 
is to prolong the coding process, but we will show that this 
additional overhead can be made to be negligible. 

The coding strategy is as follows. We first break the message 
up into $v$ chunks, such that each chunk contains an equal part 
of the information in the message, and the message is 
perfectly reconstructible given all the chunks. These chunks 
could be, for example, the $v$ digits representing the message 
$W$ when it is written as a number in a particular base. The 
fusion center will try to obtain the $v$ chunks one at a time, and 
verify that each chunk obtained is from an honest transmission. 
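
The digit-based chunking mentioned above can be sketched as follows. For convenience the sketch indexes messages from 0 rather than 1, and the function names and the choice of base are assumptions of this example.

```python
def split_into_chunks(w, v, base):
    """Write message index w (0 <= w < base**v) as v digits in the given base,
    most significant digit first. Each digit is one chunk."""
    if not 0 <= w < base ** v:
        raise ValueError("message index out of range")
    digits = []
    for _ in range(v):
        w, d = divmod(w, base)
        digits.append(d)
    return digits[::-1]

def join_chunks(chunks, base):
    """Reconstruct the message index from its chunks."""
    w = 0
    for d in chunks:
        w = w * base + d
    return w

chunks = split_into_chunks(201, v=4, base=4)   # 201 = 3*64 + 0*16 + 2*4 + 1
message = join_chunks(chunks, base=4)
```

With `base` equal to $2^{nR/v}$, each chunk carries $nR/v$ bits, so the $v$ chunks together identify one of $2^{nR}$ messages.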

Next we describe the two codebooks to be used in the 
uplink transmission over the DMC $q(y|x)$. Take any $\epsilon > 0$ 
and $R < C$. Let the number of possible messages be $M = 2^{nR}$, 
so that the message set is $\{1, \dots, 2^{nR}\}$ and the set of all 
possible chunks is $\{1, \dots, 2^{nR/v}\}$. The first codebook $Q_1$ is 
a $(2^{nR/v}, n/v, \epsilon)$ code to transmit the chunk, where $(M, n, \epsilon)$ 
represents a code over the DMC with $M$ messages, $n$ channel 
uses, and probability of error less than $\epsilon$. When a sensor 
is requested to transmit, say, the $i$th chunk of the message, 
an honest sensor will use $Q_1$ to transmit the $i$th chunk. A 
Byzantine sensor can choose to act honestly and use $Q_1$ to 
transmit the correct chunk, or it can transmit any other signal. 

The second codebook $Q_2$ is a $(j, l, \epsilon)$ code used by the 
sensors in the verification process. Specifically, to verify whether a 
transmission represents correct information, the fusion center 
uses a random binning technique. It distributes all possible 
chunks into $j$ bins and broadcasts the bin index of each 
possible chunk to the sensors. The fusion center then asks $k$ 
sensors to transmit the bin index of the particular chunk that 
the fusion center is verifying. An honest sensor will transmit 
the bin index to the fusion center using this second codebook 
$Q_2$. For fixed $j$, the code length $l$ is chosen sufficiently long 
for transmitting the bin index accurately over the DMC. A 
Byzantine sensor, if requested for the index, again can transmit 
arbitrarily, including acting honestly by using $Q_2$ to transmit the 
correct index. The numbers $j$ and $k$ are functions of the decoding 
error $\epsilon$ and are chosen large enough to ensure the fidelity 
of verification but not so large as to penalize the rate. We 
comment on their selection in Section III-C. 

The detailed transmission protocol is as follows. 

0) The fusion center randomly selects a sensor to transmit 
the next chunk (starting at the first chunk). 



1) If the selected sensor is honest, it transmits the entire 
chunk using the codebook $Q_1$. (If the selected sensor is 
Byzantine, it can act arbitrarily.) 

2) The fusion center randomly places each element in the 
set of all possible chunks into one of j bins. The fusion 
center randomly selects k sensors, and sends the binning 
to each of them. Each of those $k$ sensors then sends the 
bin index of the chunk back to the fusion center using 
code $Q_2$. 

3) If more than half of the k received bin indices match the 
bin index of the chunk that was received in step (1), the 
fusion center accepts that chunk. Otherwise it declines 
it. 

4) If the chunk was accepted, the fusion center keeps the 
same sensor selected and moves on to the next chunk (go 
to step 1). If it was declined, the fusion center randomly 
selects a new sensor and tries again with the same chunk 
(step 0). 

5) Polling stops when all chunks have been received and 
accepted. To complete the coding process, the fusion 
center extracts the original message from the v accepted 
chunks. 

Note that each time we run through steps (1) through (4), we 
use the channel n/v + kl times. 
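
The steps above can be sketched as a small simulation. The sketch idealizes both codebooks as noiseless, draws each polled sensor as Byzantine independently with probability `beta`, and fixes one particular Byzantine behavior (lie as a sender with probability `lie_prob`, support a false chunk and oppose a true one as a verifier). These modeling choices and all names are assumptions of the example, not the worst-case adversary analyzed in this paper.

```python
import random

def transmit_then_verify(true_chunks, num_values, beta, j, k, rng, lie_prob=1.0):
    """Run steps (0)-(5) over idealized noiseless channels.

    Returns (accepted_chunks, rounds). One round is one pass through steps
    (1)-(4), i.e. n/v + k*l channel uses in the notation of the text.
    Requires num_values >= 2 and j >= 2.
    """
    accepted, rounds = [], 0
    sender_is_byzantine = rng.random() < beta            # step (0)
    while len(accepted) < len(true_chunks):
        truth = true_chunks[len(accepted)]
        rounds += 1
        # Step (1): honest senders send the true chunk; a Byzantine sender
        # (hypothetical strategy) sends a false one with probability lie_prob.
        if sender_is_byzantine and rng.random() < lie_prob:
            received = rng.choice([c for c in range(num_values) if c != truth])
        else:
            received = truth
        # Step (2): a fresh random binning is drawn for every verification.
        bins = [rng.randrange(j) for _ in range(num_values)]
        matches = 0
        for _ in range(k):
            if rng.random() < beta:
                # Byzantine verifier: support a false chunk, oppose a true one.
                vote = bins[received] if received != truth else (bins[truth] + 1) % j
            else:
                vote = bins[truth]                        # honest verifier
            matches += vote == bins[received]
        # Steps (3)-(4): accept on a strict majority, else select a new sender.
        if 2 * matches > k:
            accepted.append(received)
        else:
            sender_is_byzantine = rng.random() < beta
    return accepted, rounds                               # step (5)

chunks = [3, 0, 2, 1]
honest_run = transmit_then_verify(chunks, 4, 0.0, 8, 5, random.Random(1))
attacked_run = transmit_then_verify(chunks, 4, 0.3, 8, 21, random.Random(2))
```

With `beta = 0` every chunk is accepted on the first attempt, so the number of rounds equals the number of chunks. With Byzantine sensors present, the extra rounds are the overhead that the rate analysis must show to be negligible.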

In step (2), we have used a random binning procedure. This 
is different from the way such a procedure is often used, in 
which it is done just once during the construction of the code, 
but then the codebook is fixed. Here, we actually construct an 
entirely new random binning every time we do step (2). This 
is necessary because if we used some fixed or deterministic 
binning, then if a Byzantine sensor is selected to transmit 
a chunk in step (1), it would know the binning to be used 
beforehand, so it could find a chunk in the same bin as the 
real chunk, which would make the verification useless. The 
probability that the Byzantine sensor selects a chunk different 
from the real chunk but in the same bin must be small, so we 
need dynamic random binning. 
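
The following sketch contrasts the two cases. It estimates how often a false chunk chosen by a Byzantine sender lands in the same bin as the true chunk, first when the binning is known before the chunk is sent and then when the binning is drawn afterwards. The chunk alphabet size, the number of bins, and the attacker's search rule are illustrative assumptions.

```python
import random

def collision_rate(num_values, j, trials, rng, binning_known_in_advance):
    """Fraction of trials in which the attacker's false chunk shares a bin
    with the true chunk."""
    hits = 0
    for _ in range(trials):
        truth = rng.randrange(num_values)
        bins = [rng.randrange(j) for _ in range(num_values)]
        if binning_known_in_advance:
            # The attacker searches the known binning for a colliding chunk.
            candidates = [c for c in range(num_values)
                          if c != truth and bins[c] == bins[truth]]
            fake = candidates[0] if candidates else (truth + 1) % num_values
        else:
            # The attacker must commit to a false chunk before the binning exists.
            fake = (truth + 1) % num_values
        hits += bins[fake] == bins[truth]
    return hits / trials

rng = random.Random(7)
known = collision_rate(256, 8, 2000, rng, binning_known_in_advance=True)
fresh = collision_rate(256, 8, 2000, rng, binning_known_in_advance=False)
```

With a known binning the attacker almost always finds a colliding chunk, whereas with a binning drawn after the transmission the collision rate stays near $1/j$.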

C. Error Events and Error Probability Analysis 

We show next that, with appropriately chosen $n$, $v$, $k$ in 
the two codebooks, the probability that a message is decoded 
incorrectly goes to zero, and the decoding process will end 
with an average number of transmissions approximately $n + 
O(\epsilon n)$. Thus with a message set of size $2^{nR}$, and $R > C - \epsilon$, 
we have the proof of the main theorem. 

To analyze the probability of error, we need to define some 
events. Events $\mathcal{A}_1, \mathcal{A}_2, \mathcal{A}_3$ are the most basic ways in which 
errors can occur. Events $\mathcal{B}_1, \mathcal{B}_2$ have to do with the conclusion the 
fusion center reaches, and thus determine how the coding will 
progress. 

• $\mathcal{A}_1$: A coding error occurs in step (1), i.e., the transmitted 
chunk is different from the decoded one. 

• $\mathcal{A}_2$: Of the $k$ bin indices that are decoded in step (2), fewer 
than half of them equal the bin index for the true chunk. 

• $\mathcal{A}_3$: For a given pair of distinct chunks, they are both put 
into the same bin in step (2). 

• $\mathcal{B}_1$: The chunk is declined in step (3). 

• $\mathcal{B}_2$: A chunk is accepted in step (3) and that chunk is not 
the true one. 

• $\mathcal{C}$: The true chunk is transmitted in step (1). 

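The following lemma bounds the probabilities of events 
relevant to the error analysis. 
Lemma 1: Define 

$$p_1 = \Pr(\mathcal{B}_1 \mid \mathcal{C}), \quad p_2 = \Pr(\mathcal{B}_2 \mid \mathcal{C}^c), \quad p_3 = \Pr(\mathcal{B}_2 \mid \mathcal{C}).$$

For sufficiently large $j$ and $k$, and no matter what the Byzantine 
sensors do, $\Pr(\mathcal{A}_i) \le \epsilon$ for $i = 1, 2, 3$, and 

$$p_1 \le \Pr(\mathcal{A}_1) + \Pr(\mathcal{A}_2), \quad p_2 \le \Pr(\mathcal{A}_2) + \Pr(\mathcal{A}_3),$$
$$p_3 \le \Pr(\mathcal{A}_1)\bigl(\Pr(\mathcal{A}_2) + \Pr(\mathcal{A}_3)\bigr).$$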

Proof: Since $Q_1$ was constructed to have error probability 
less than $\epsilon$, $\Pr(\mathcal{A}_1) \le \epsilon$. 

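Now we show $\Pr(\mathcal{A}_2) \le \epsilon$ for sufficiently large $k$. Consider 
one of the $k$ sensors polled in step (2). With probability at 
least $(1-\beta)(1-\epsilon)$, it is honest and no $Q_2$ decoding error 
occurs when it sends its bin index. These two events are sufficient 
(though not necessary) for the decoded bin index to be the real 
one. Therefore the probability that the decoded bin index is 
the real one is at least $(1-\beta)(1-\epsilon)$, so the probability that 
the decoded bin index is not the true one is no more than $\alpha = 
1 - (1-\beta)(1-\epsilon)$. Thus the number of decoded bin indices that 
are incorrect is upper bounded by a binomial distribution 
with each one having probability $\alpha$ of being incorrect. Since 
$\beta < 1/2$, for sufficiently small $\epsilon$, $\alpha < 1/2$, so we will assume 
that this is the case. Thus 

\begin{align}
\Pr(\mathcal{A}_2) &\le \sum_{i=\lceil k/2 \rceil}^{k} \binom{k}{i} \alpha^i (1-\alpha)^{k-i} \notag \\
&\le \binom{k}{\lceil k/2 \rceil} \sum_{i=\lceil k/2 \rceil}^{k} \alpha^i (1-\alpha)^{k-i} \tag{2} \\
&= \binom{k}{\lceil k/2 \rceil} \frac{\alpha^{\lceil k/2 \rceil}(1-\alpha)^{k-\lceil k/2 \rceil+1} - \alpha^{k+1}}{1-2\alpha} \notag \\
&\le \frac{1-\alpha}{1-2\alpha} \bigl(4\alpha(1-\alpha)\bigr)^{k/2} \tag{3}
\end{align}

where (2) holds because $\binom{k}{i} \le \binom{k}{\lceil k/2 \rceil}$ for all $i \in \{0, \dots, k\}$, 
and (3) holds because $\alpha < 1/2$, so the denominator $1 - 2\alpha$ 
is positive, so the $-\alpha^{k+1}$ term can be dropped, and because 
$\binom{k}{\lceil k/2 \rceil} \le 2^k$ and $\alpha^{\lceil k/2 \rceil}(1-\alpha)^{k-\lceil k/2 \rceil} \le (\alpha(1-\alpha))^{k/2}$. If 

$$k \ge \frac{2\log\left(\frac{1-2\alpha}{1-\alpha}\,\epsilon\right)}{\log(4\alpha(1-\alpha))}, \tag{4}$$

then $\Pr(\mathcal{A}_2) \le \epsilon$. 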
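
To get a feel for how large $k$ must be, the following sketch evaluates the binomial tail exactly and compares it with a closed-form bound of the form $\frac{1-\alpha}{1-2\alpha}(4\alpha(1-\alpha))^{k/2}$, which follows from bounding every binomial coefficient by $2^k$ and summing a geometric series. The specific form of the bound, the function names, and the example values of $\beta$ and $\epsilon$ are assumptions of this illustration.

```python
import math

def majority_error_probability(k, alpha):
    """P[Binomial(k, alpha) >= k/2]: at least half of the k decoded bin
    indices are wrong when each is wrong independently with probability alpha."""
    return sum(math.comb(k, i) * alpha**i * (1 - alpha)**(k - i)
               for i in range(math.ceil(k / 2), k + 1))

def geometric_bound(k, alpha):
    """Closed-form upper bound on the tail above, valid for alpha < 1/2."""
    return (1 - alpha) / (1 - 2 * alpha) * (4 * alpha * (1 - alpha)) ** (k / 2)

def smallest_k(beta, eps, k_max=10000):
    """Smallest k whose exact tail probability is at most eps."""
    alpha = 1 - (1 - beta) * (1 - eps)
    if alpha >= 0.5:
        raise ValueError("need alpha < 1/2; decrease eps or beta")
    for k in range(1, k_max + 1):
        if majority_error_probability(k, alpha) <= eps:
            return k
    raise ValueError("no k <= k_max meets the target")

alpha = 1 - (1 - 0.3) * (1 - 0.05)   # example: beta = 0.3, eps = 0.05
k_needed = smallest_k(0.3, 0.05)
```

Since the bound decays geometrically in $k$ whenever $\alpha < 1/2$, the required $k$ grows only logarithmically in $1/\epsilon$ for fixed $\beta$.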

Next we show $\Pr(\mathcal{A}_3) \le \epsilon$ for sufficiently large $j$. Since 
there are $j$ bins, the probability that two different chunks are 
put into the same bin in step (2) is $1/j$. Thus if $j \ge 1/\epsilon$, 
$\Pr(\mathcal{A}_3) \le \epsilon$. 

Note that $p_1$ is the probability that the received chunk is 
declined in step (3) given the true chunk was transmitted in 
step (1). One way for this to happen is for there to be a coding 
error in step (1), i.e., $\mathcal{A}_1$ occurs, so the received chunk will not 
be the true chunk, so the polled sensors may not confirm it. 
Note that a coding error does not necessitate the chunk being 
declined, but it does cover a large set of the ways it could 
happen. If $\mathcal{A}_1$ does not occur, then the received chunk is the 
true one, so the chunk could only be declined if the majority 
of the bin indices received in step (2) do not match the true 
chunk, i.e., $\mathcal{A}_2$ occurs. Thus 

$$p_1 \le \Pr(\mathcal{A}_1 \cup \mathcal{A}_2) \le \Pr(\mathcal{A}_1) + \Pr(\mathcal{A}_2).$$

Next, $p_2$ is the probability that an incorrect chunk is accepted 
given that an incorrect chunk is transmitted in step (1). 
If more than half of the decoded bin indices are incorrect ($\mathcal{A}_2$), 
then those incorrect bin indices might confirm the incorrect 
chunk. If not, then the only way for the incorrect chunk to be 
accepted is for it to fall into the same bin as the true chunk 
($\mathcal{A}_3$). Thus 

$$p_2 \le \Pr(\mathcal{A}_2 \cup \mathcal{A}_3) \le \Pr(\mathcal{A}_2) + \Pr(\mathcal{A}_3).$$

Finally, $p_3$ is the probability that an incorrect chunk is 
accepted given that the correct chunk is transmitted in step (1). 
In order for this to happen, the decoded chunk must not be 
the true one, so a coding error must occur ($\mathcal{A}_1$). In addition, 
for that decoded incorrect chunk to be accepted, more than 
half of the decoded bin indices must be incorrect ($\mathcal{A}_2$) or the 
incorrect chunk must fall into the same bin as the real 
one ($\mathcal{A}_3$). Thus 

$$p_3 \le \Pr(\mathcal{A}_1)\Pr(\mathcal{A}_2 \cup \mathcal{A}_3) \le \Pr(\mathcal{A}_1)\bigl(\Pr(\mathcal{A}_2) + \Pr(\mathcal{A}_3)\bigr).$$

■ 

As the coding scheme proceeds, it moves through a 
number of different states, depending on the number of chunks 
the fusion center has received thus far, and whether the 
selected sensor is Byzantine. Depending on the exact sequence 
of events, the fusion center might remain at a certain state for 
some time, requesting the same chunk several times until it 
finds an honest sensor. The progress is probabilistic because 
every time the fusion center selects a sensor it might be 
Byzantine or honest, and every time it receives a transmission, 
a transmission error might or might not occur. In fact, the 
progress of the coding scheme can be modeled as a Markov 
process. In particular, it will be a Markov decision process, 
because a Byzantine sensor, if it is selected to transmit a 
chunk, has some choice about what to transmit. That choice 
will influence the probabilities of future events. The Markov 
decision process that we will use to analyze the error probability 
of this scheme is diagrammed in Fig. 2. 

The process will have $2v + 3$ states. State $i$, for $i = 0, \dots, v$, 
represents the fusion center having successfully received $i$ true 
chunks with the currently selected sensor honest. State $i'$ is 
the same except the currently selected sensor is Byzantine. 
Finally, state e represents the fusion center having accepted at 
least one false chunk. The decision for the Markov decision 
process will be whether a Byzantine sensor, if it is asked to 
send a chunk in step (1), chooses to send the true chunk or not. 
Thus a decision will only be made when a Byzantine sensor 
has been selected, i.e., we are in one of the i' states. 
Fig. 2. The Markov decision process used to find the error probability. Dashed 
lines from a state represent the Byzantine sensor choosing to send erroneous 
information, and dotted lines represent the Byzantine sensor choosing to send 
true information. 

States $v$, $v'$, and $e$ will be terminal states, so an error will 
occur if we reach state $e$ before state $v$ or $v'$. Define 

$$e_i = \Pr(\text{error occurs starting from state } i),$$
$$e'_i = \Pr(\text{error occurs starting from state } i').$$

In executing the Markov decision process, the Byzantine 
sensors make decisions to maximize the probability of error. 
At the very beginning of the coding scheme, we select a sensor 
which will be honest with probability $1-\beta$ and Byzantine with 
probability $\beta$. Thus, the total probability of error is 

$$P_e = (1-\beta)e_0 + \beta e'_0.$$

From state $i$, with probability $p_1$ the chunk will be declined. 
The fusion center then selects a new sensor, which will be 
Byzantine with probability $\beta$ and honest with probability $1-\beta$. 
Thus we transition to state $i'$ with probability $p_1\beta$ and back 
to state $i$ with probability $p_1(1-\beta)$. With probability $p_3$, an 
incorrect chunk is accepted, so we transition to state $e$. Finally, 
with probability $1 - p_1 - p_3$ the true chunk is accepted, so we 
transition to state $i+1$. 

From state $i'$, the transition probabilities depend on the 
decision. If the Byzantine sensor chooses not to send the 
true chunk, then with probability $p_2$ the false chunk will be 
accepted, so we transition to state $e$. Otherwise, the fusion 
center selects a new sensor. Thus with probability $(1-p_2)\beta$ 
we return to state $i'$, and with probability $(1-p_2)(1-\beta)$ we 
transition to state $i$. If the Byzantine sensor decides to send 
the true chunk, then the transition probabilities are essentially 
the same as they were from state $i$: with probability $p_1\beta$ we 
return to state $i'$, with probability $p_1(1-\beta)$ we transition to 
state $i$, with probability $p_3$ we transition to state $e$, and with 
probability $1 - p_1 - p_3$ we transition to state $(i+1)'$. 
From these transition probabilities, we see that 

$$e_i = p_3 + (1-p_1-p_3)e_{i+1} + p_1\beta e'_i + p_1(1-\beta)e_i,$$
$$e'_i = \max\{p_2 + (1-p_2)\beta e'_i + (1-p_2)(1-\beta)e_i,\; p_3 + (1-p_1-p_3)e'_{i+1} + p_1\beta e'_i + p_1(1-\beta)e_i\}.$$

The maximum represents the Byzantine sensors always making 
the decision that maximizes the error probability. In 
addition, if we arrive at either state $v$ or $v'$, the fusion center 
has received the entire message without error, so $e_v = e'_v = 0$. 
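
These recursions can be evaluated numerically by backward induction on $i$, solving the coupled pair of equations at each stage by fixed-point iteration. The sketch below treats $p_1$, $p_2$, $p_3$ as given transition probabilities. The function names, the tolerance, and the use of plain value iteration are choices made for this illustration.

```python
def error_probabilities(v, beta, p1, p2, p3, tol=1e-12):
    """Return lists (e, e_prime), indexed 0..v, solving the recursion for the
    error probabilities from states i and i' by backward induction."""
    e = [0.0] * (v + 1)        # e[v] = 0: terminal state v
    ep = [0.0] * (v + 1)       # ep[v] = 0: terminal state v'
    for i in range(v - 1, -1, -1):
        x, y = 0.0, 0.0        # current guesses for e_i and e'_i
        while True:
            new_x = p3 + (1 - p1 - p3) * e[i + 1] + p1 * beta * y + p1 * (1 - beta) * x
            # Byzantine sender transmits a false chunk.
            lie = p2 + (1 - p2) * beta * y + (1 - p2) * (1 - beta) * x
            # Byzantine sender transmits the true chunk.
            truth = p3 + (1 - p1 - p3) * ep[i + 1] + p1 * beta * y + p1 * (1 - beta) * x
            new_y = max(lie, truth)
            done = abs(new_x - x) < tol and abs(new_y - y) < tol
            x, y = new_x, new_y
            if done:
                break
        e[i], ep[i] = x, y
    return e, ep

def total_error(v, beta, p1, p2, p3):
    """P_e = (1 - beta) * e_0 + beta * e'_0."""
    e, ep = error_probabilities(v, beta, p1, p2, p3)
    return (1 - beta) * e[0] + beta * ep[0]

# Example values (assumed): v = 20 chunks, beta = 0.3, small p1, p2, p3.
example_error = total_error(20, 0.3, 0.02, 0.02, 0.0004)
```

Each stage is a contraction as long as $p_1 < 1$ and $p_2 > 0$, so the inner loop converges. For very small $p_2$ the convergence is slow, and solving the two-by-two linear system for each decision directly would be a reasonable alternative.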

D. Code Rate 

We also need to consider the rate of this code. To show that 
the rate can be made arbitrarily close to $C$, we need to show 
that the expected number of channel uses $E(N)$ converges to 
$n$ as $\epsilon$ goes to zero. Each time a chunk is transmitted (i.e., 
each time we run through steps (1) to (4)), the channel is 
used $n/v + kl$ times. All we need to know is the expected 
number of chunks that are transmitted in the entire coding 
scheme. To find this, we use a Markov decision 
process similar to the one described above. The only difference is 
that we are not interested in whether an error occurs, 
only in how long it takes to finish. Thus we remove state 
$e$ and redefine states $i$ and $i'$ to represent the fusion center 
having accepted $i$ chunks, not all of them necessarily 
correct. Thus every time we would transition to state $e$, we 
actually transition somewhere else. For instance, if we are in 
state $i'$ and the Byzantine sensors choose to send erroneous 
information, then with probability $p_2$ the chunk is accepted, 
so we transition to state $(i+1)'$ instead of $e$. Let $q_i$ and $q'_i$ be the 
expected number of steps made in the Markov decision process 
before reaching one of the terminal states ($v$ or $v'$) given that 
we start at state $i$ or $i'$ respectively and the Byzantine sensors 
make decisions that maximize the expected number of steps. 
Then 

$$q_i = 1 + (1-p_1)q_{i+1} + p_1\beta q'_i + p_1(1-\beta)q_i,$$
$$q'_i = \max\{1 + p_2 q'_{i+1} + (1-p_2)\beta q'_i + (1-p_2)(1-\beta)q_i,\; 1 + (1-p_1)q'_{i+1} + p_1\beta q'_i + p_1(1-\beta)q_i\}.$$

Again, $q_v = q'_v = 0$. 

Lemma 2 (Average Code Length): There exist $n$, $v$, $j$, and
$k$ as functions of $\epsilon$ such that the error probability
$P_e \to 0$ and the expected number of channel uses
$E(N) \to n$ as $\epsilon \to 0$.

Proof: Take $j$ and $k$ large enough for Lemma 1 to hold,
and $n$ and $v$ such that

\[
\frac{2}{\epsilon} > v > \frac{1}{\epsilon}, \qquad n > \frac{klv}{\epsilon}. \tag{5}
\]
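For concreteness, the constraints in (5) can be met with integer parameters as in the following sketch. This is an assumption-laden illustration rather than the paper's procedure: `choose_parameters` is a hypothetical helper, $k$ and $l$ are taken as given positive integers (their values come from Lemma 1 and the protocol description, not from this code), and $n$ is additionally assumed to be a multiple of $v$ so that the message splits into equal chunks.

```python
from fractions import Fraction
import math


def choose_parameters(eps, k, l):
    """Pick integers (v, n) with 1/eps < v < 2/eps and n > k*l*v/eps."""
    eps = Fraction(eps)  # exact arithmetic avoids floating-point edge cases
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    v = math.floor(1 / eps) + 1            # smallest integer above 1/eps
    chunks = math.floor(k * l / eps) + 1   # n/v: smallest integer above k*l/eps
    n = chunks * v                         # assumption: n is a multiple of v
    return v, n


if __name__ == "__main__":
    # Hypothetical values of k and l, chosen only for illustration.
    print(choose_parameters(Fraction(1, 10), k=3, l=2))
```

Because $v \le 1/\epsilon + 1 < 2/\epsilon$ whenever $\epsilon < 1$, the first constraint always holds, and $n/v > kl/\epsilon$ gives the second.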
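We define $f_i$, $f'_i$ for $i = 0, \dots, v$ as follows. Let $f_v = f'_v = 0$,
and for $i < v$,

\begin{align}
f_i &\triangleq p_3 + (1 - p_1 - p_3)\, f_{i+1} + p_1 \beta\, f'_i + p_1 (1-\beta)\, f_i, \tag{6} \\
f'_{i,a} &\triangleq p_2 + \beta\, f'_i + (1-\beta)\, f_i, \tag{7} \\
f'_{i,b} &\triangleq p_3 + (1 - p_1 - p_3)\, f'_{i+1} + p_1 \beta\, f'_i + p_1 (1-\beta)\, f_i, \tag{8} \\
f'_i &\triangleq \max\{ f'_{i,a},\, f'_{i,b} \}. \tag{9}
\end{align}
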

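The only difference between $f_i$, $f'_i$ and $e_i$, $e'_i$ is that the $(1 - p_2)$
factors have been dropped from the second two terms in (2).
Thus $e_i \le f_i$ and $e'_i \le f'_i$ for all $i$. Fix some $i \in \{0, \dots, v-1\}$.
If $f'_i = f'_{i,a}$, then by (7),

\[
f'_i = f_i + \frac{p_2}{1-\beta}. \tag{10}
\]

Combining this with (6) gives

\[
f_i = f_{i+1} + \frac{p_3}{1-p_1} + \frac{p_1 p_2 \beta}{(1-p_1)(1-\beta)}, \tag{11}
\]

which with (10) produces

\begin{align}
f'_i &= f_{i+1} + \frac{p_3}{1-p_1} + \frac{p_1 p_2 \beta}{(1-p_1)(1-\beta)} + \frac{p_2}{1-\beta} \notag \\
&= f_{i+1} + \frac{p_3}{1-p_1} + \frac{p_2 \left(1 - p_1(1-\beta)\right)}{(1-p_1)(1-\beta)}. \tag{12}
\end{align}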



If $f'_i = f'_{i,b}$, then combining (6) with (8) gives

\[
f'_i = \frac{p_3}{1-p_1} + p_1(1-\beta)\, f_{i+1} + \left(1 - p_1(1-\beta)\right) f'_{i+1}. \tag{13}
\]

Note that (12) and (13) are what $f'_i$ would be if it equaled
$f'_{i,a}$ or $f'_{i,b}$ respectively. However, these expressions are not
necessarily equal to $f'_{i,a}$ and $f'_{i,b}$, because we have used (6)
to derive both of them, which contains the real value of $f'_i$.
Still, because of the definition of $f'_i$ in (9), the larger of (12)
and (13) will be the true value of $f'_i$.

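We will now show by induction that $f'_i = f'_{i,a}$ for $i =
0, \dots, v-1$. For $i = v-1$, since $f_v = f'_v = 0$, it is clear that
the expression in (12) is larger than that in (13), so $f'_{v-1} =
f'_{v-1,a}$. Now we assume that $f'_{i+1} = f'_{i+1,a}$ and show that
$f'_i = f'_{i,a}$. By (10),

\[
f'_{i+1} = f_{i+1} + \frac{p_2}{1-\beta}.
\]

Thus, if $f'_i = f'_{i,b}$, (13) becomes

\begin{align*}
f'_i &= \frac{p_3}{1-p_1} + p_1(1-\beta)\, f_{i+1} + \left(1 - p_1(1-\beta)\right)\left(f_{i+1} + \frac{p_2}{1-\beta}\right) \\
&= f_{i+1} + \frac{p_3}{1-p_1} + \frac{p_2\left(1 - p_1(1-\beta)\right)}{1-\beta}.
\end{align*}

Since the expression in (12) is larger than this, $f'_i = f'_{i,a}$.
Therefore (11) holds for $i = 0, \dots, v-1$, so

\[
f_i = \left( \frac{p_3}{1-p_1} + \frac{p_1 p_2 \beta}{(1-p_1)(1-\beta)} \right)(v - i). \tag{14}
\]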



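Thus

\begin{align}
P_e &= (1-\beta)\, e_0 + \beta\, e'_0 \notag \\
&\le (1-\beta)\, f_0 + \beta\, f'_0 \notag \\
&= f_0 + \frac{p_2 \beta}{1-\beta} \tag{15} \\
&= \frac{p_1 p_2 \beta + p_3 (1-\beta)}{(1-p_1)(1-\beta)}\, v + \frac{p_2 \beta}{1-\beta} \tag{16} \\
&\le \frac{4\epsilon^2 \beta + 2\epsilon^2 (1-\beta)}{(1-2\epsilon)(1-\beta)}\, v + \frac{2\epsilon\beta}{1-\beta} \notag \\
&\le \left( \frac{8\beta + 4(1-\beta)}{(1-2\epsilon)(1-\beta)} + \frac{2\beta}{1-\beta} \right) \epsilon, \tag{17}
\end{align}

where (15) is from (10), (16) is from (14), and (17) is from
Lemma 1 and (5). Thus $P_e \to 0$ as $\epsilon \to 0$.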
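To get a feel for how quickly the error bound vanishes, the closing bound on $P_e$ in the chain above can be evaluated directly. The helper name `error_bound` and the sample values of $\epsilon$ and $\beta$ below are hypothetical; the function is simply the expression $\left(\frac{8\beta + 4(1-\beta)}{(1-2\epsilon)(1-\beta)} + \frac{2\beta}{1-\beta}\right)\epsilon$, which is meaningful for $\epsilon < 1/2$ and $\beta < 1$.

```python
def error_bound(eps, beta):
    """Evaluate the closing upper bound on P_e as a function of eps and beta."""
    if not (0 < eps < 0.5 and 0 <= beta < 1):
        raise ValueError("need 0 < eps < 1/2 and 0 <= beta < 1")
    leading = (8 * beta + 4 * (1 - beta)) / ((1 - 2 * eps) * (1 - beta))
    return (leading + 2 * beta / (1 - beta)) * eps


if __name__ == "__main__":
    # Hypothetical Byzantine fraction; the bound shrinks roughly linearly in eps.
    for eps in (0.1, 0.01, 0.001):
        print(eps, error_bound(eps, beta=0.3))
```

The bracketed constant stays bounded as $\epsilon \to 0$, so the bound decays linearly in $\epsilon$ for any fixed $\beta < 1$.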

Now we analyze $q_i$, $q'_i$ to find $E(N)$. Combining the expression
for $q_i$ in (3) with either expression for $q'_i$ in the maximum
in (4) yields expressions of the form

\begin{align}
q_i &= 1 + \gamma + \delta\, q_{i+1} + (1-\delta)\, q'_{i+1}, \tag{18} \\
q'_i &= 1 + \gamma' + \delta'\, q_{i+1} + (1-\delta')\, q'_{i+1}, \tag{19}
\end{align}

where $\gamma, \gamma' \ge 0$ and $\delta, \delta' \in [0, 1]$. The quantity $\gamma$ represents
the expected number of state transitions between states $i$ and
$i'$ before moving on to state $i+1$ or $(i+1)'$, given that we start
at state $i$, and $\delta$ represents the probability that when we do
transition away from states $i$ and $i'$, we go to state $i+1$ and
not $(i+1)'$. The quantities $\gamma'$ and $\delta'$ are the same except starting
at state $i'$. Obviously, the values of these will depend on which
element of the maximum is larger, but for our current purposes
it only matters that the expressions will have this form.

We will now show by induction that $q_i - q_{i+1} \ge 1$ and
$q'_i - q'_{i+1} \ge 1$ for $i = 0, \dots, v-1$. First consider $i = v-1$.
Since $q_v = q'_v = 0$, by (18) and (19), $q_{v-1} = 1 + \gamma$ and $q'_{v-1} =
1 + \gamma'$. Thus $q_{v-1} - q_v \ge 1$ and $q'_{v-1} - q'_v \ge 1$. Now we
assume that $q_{i+1} - q_{i+2} \ge 1$ and $q'_{i+1} - q'_{i+2} \ge 1$ and show
that $q_i - q_{i+1} \ge 1$ and $q'_i - q'_{i+1} \ge 1$. By assumption and (18),

\begin{align*}
q_i - q_{i+1} &= \delta (q_{i+1} - q_{i+2}) + (1-\delta)(q'_{i+1} - q'_{i+2}) \\
&\ge \delta + (1-\delta) \\
&= 1.
\end{align*}

Similarly by (19),

\begin{align*}
q'_i - q'_{i+1} &= \delta' (q_{i+1} - q_{i+2}) + (1-\delta')(q'_{i+1} - q'_{i+2}) \\
&\ge \delta' + (1-\delta') \\
&= 1.
\end{align*}

Thus $q_i - q_{i+1} \ge 1$ and $q'_i - q'_{i+1} \ge 1$ for $i = 0, \dots, v-1$.
In particular, $q'_{i+1} \le q'_i - 1$.

Suppose the first element of the maximum is larger in (4).
Then

\begin{align*}
q'_i &= 1 + p_2\, q'_{i+1} + (1-p_2)\beta\, q'_i + (1-p_2)(1-\beta)\, q_i \\
&\le 1 + p_2 (q'_i - 1) + (1-p_2)\beta\, q'_i + (1-p_2)(1-\beta)\, q_i.
\end{align*}

This can be rewritten

\[
q'_i \le \frac{1}{1-\beta} + q_i. \tag{20}
\]

Now suppose the second element of the maximum is larger in
(4). Then

\begin{align*}
q'_i &= 1 + (1-p_1)\, q'_{i+1} + p_1 \beta\, q'_i + p_1 (1-\beta)\, q_i \\
&\le 1 + (1-p_1)(q'_i - 1) + p_1 \beta\, q'_i + p_1 (1-\beta)\, q_i.
\end{align*}

This can also be rewritten to (20), so (20) must hold no matter
which value is larger in the maximum in (4). Thus

\[
q_i \le 1 + (1-p_1)\, q_{i+1} + p_1 \beta \left( \frac{1}{1-\beta} + q_i \right) + p_1 (1-\beta)\, q_i.
\]

This can be rewritten

\[
q_i \le 1 + \frac{p_1}{(1-p_1)(1-\beta)} + q_{i+1},
\]

so

\[
q_i \le \left( 1 + \frac{p_1}{(1-p_1)(1-\beta)} \right)(v - i). \tag{21}
\]
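The bounds (20) and (21) can be checked numerically against the recursions for $q_i$ and $q'_i$. The following sketch makes the same assumptions as any direct evaluation of those recursions ($p_1 < 1$, $\beta < 1$); the names `solve2` and `expected_steps` and the parameter values are hypothetical. At each level it solves the 2x2 linear system for each Byzantine choice and keeps the one giving the larger $q'_i$.

```python
def solve2(a11, a12, b1, a21, a22, b2):
    """Solve [[a11, a12], [a21, a22]] (x, y) = (b1, b2) by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


def expected_steps(p1, p2, beta, v):
    """Return lists (q, q_prime) with q[i] = q_i and q_prime[i] = q'_i."""
    q = [0.0] * (v + 1)        # q[v] = 0: terminal state v
    q_prime = [0.0] * (v + 1)  # q_prime[v] = 0: terminal state v'
    for i in range(v - 1, -1, -1):
        # Equation for q_i, shared by both Byzantine choices.
        a11, a12 = 1 - p1 * (1 - beta), -p1 * beta
        b1 = 1 + (1 - p1) * q[i + 1]
        # First element of the max: erroneous chunk, accepted w.p. p2.
        erroneous = solve2(a11, a12, b1,
                           -(1 - p2) * (1 - beta), 1 - (1 - p2) * beta,
                           1 + p2 * q_prime[i + 1])
        # Second element of the max: the Byzantine sensor acts honestly.
        honest = solve2(a11, a12, b1,
                        -p1 * (1 - beta), 1 - p1 * beta,
                        1 + (1 - p1) * q_prime[i + 1])
        # Keep the choice that gives the larger q'_i.
        q[i], q_prime[i] = max(erroneous, honest, key=lambda pair: pair[1])
    return q, q_prime


if __name__ == "__main__":
    p1, p2, beta, v = 0.1, 0.05, 0.3, 5  # hypothetical values
    q, q_prime = expected_steps(p1, p2, beta, v)
    slope = 1 + p1 / ((1 - p1) * (1 - beta))
    for i in range(v):
        # Each line prints the two sides of (20) and then of (21).
        print(i, q_prime[i], 1 / (1 - beta) + q[i], q[i], slope * (v - i))
```

In the degenerate case $p_1 = p_2 = 0$ the recursion gives $q_i = v - i$ and $q'_i = 1/(1-\beta) + v - i$, so both bounds hold with equality there.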



Let $V$ be the random variable denoting the total number of
chunks that are requested in the entire coding session. Since
we start at state $0$ with probability $1 - \beta$ and at state $0'$ with
probability $\beta$,

\begin{align}
E(V) &= (1-\beta)\, q_0 + \beta\, q'_0 \notag \\
&\le q_0 + \frac{\beta}{1-\beta} \tag{22} \\
&\le \left( 1 + \frac{p_1}{(1-p_1)(1-\beta)} \right) v + \frac{\beta}{1-\beta}, \tag{23}
\end{align}

where (22) is from (20) and (23) is from (21). Thus



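\begin{align}
E(N) &= E(V) \left( \frac{n}{v} + kl \right) \notag \\
&\le \left[ \left( 1 + \frac{p_1}{(1-p_1)(1-\beta)} \right) v + \frac{\beta}{1-\beta} \right] \left( \frac{n}{v} + kl \right) \notag \\
&= n \left[ 1 + \frac{p_1}{(1-p_1)(1-\beta)} + \frac{\beta}{1-\beta} \frac{1}{v}
 + \left( 1 + \frac{p_1}{(1-p_1)(1-\beta)} \right) \frac{klv}{n} + \frac{\beta}{1-\beta} \frac{kl}{n} \right] \notag \\
&\le n \left[ 1 + \frac{2\epsilon}{(1-2\epsilon)(1-\beta)} + \frac{\epsilon\beta}{1-\beta}
 + \left( 1 + \frac{2\epsilon}{(1-2\epsilon)(1-\beta)} \right) \epsilon + \frac{\epsilon^2 \beta}{1-\beta} \right] \tag{24} \\
&= n \left[ 1 + \epsilon \left( 1 + \frac{2(1+\epsilon)}{(1-2\epsilon)(1-\beta)} + \frac{\beta(1+\epsilon)}{1-\beta} \right) \right], \notag
\end{align}

where (24) is from Lemma 1 and (5). Thus $E(N) \to n$ as
$\epsilon \to 0$.

Therefore the rate of this code,

\[
\frac{nR}{E(N)},
\]

converges to $R$ as $\epsilon$ goes to 0. Thus $C$ is achievable.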
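As a quick numerical illustration of this convergence, the final expression bounding $E(N)/n$ can be evaluated for decreasing $\epsilon$. The helper names `length_overhead_bound` and `rate_lower_bound` and the sample values are hypothetical; the first function is the factor $1 + \epsilon\left(1 + \frac{2(1+\epsilon)}{(1-2\epsilon)(1-\beta)} + \frac{\beta(1+\epsilon)}{1-\beta}\right)$, and the second divides a nominal rate $R$ by it.

```python
def length_overhead_bound(eps, beta):
    """Upper bound on E(N)/n; valid for 0 < eps < 1/2 and 0 <= beta < 1."""
    if not (0 < eps < 0.5 and 0 <= beta < 1):
        raise ValueError("need 0 < eps < 1/2 and 0 <= beta < 1")
    return 1 + eps * (1
                      + 2 * (1 + eps) / ((1 - 2 * eps) * (1 - beta))
                      + beta * (1 + eps) / (1 - beta))


def rate_lower_bound(rate, eps, beta):
    """Lower bound on the achieved rate nR / E(N) implied by the bound above."""
    return rate / length_overhead_bound(eps, beta)


if __name__ == "__main__":
    # Hypothetical nominal rate and Byzantine fraction.
    for eps in (0.1, 0.01, 0.001):
        print(eps, rate_lower_bound(1.0, eps, beta=0.3))
```

The achieved rate approaches the nominal rate from below as $\epsilon$ shrinks, at a speed governed by $1/(1-\beta)$.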

IV. Conclusion 

We showed in this paper that, by cooperative sensor fusion, 
the presence of Byzantine sensors can be completely mitigated 
when the Byzantine sensor population is less than half of the 
total number of sensors, but no information can be transmitted 
when at least half of the sensors are Byzantine. We proposed a
"transmit-then-verify" scheme that forces a Byzantine sensor
to either act honestly or reveal its Byzantine identity. The key
to this idea is the use of random binning in sensor polling.
Note that the random binning in our strategy is not a random 
coding argument; it is an actual randomized transmission 
protocol. 

Several simple generalizations can be made. The network
does not have to contain an infinite number of sensors. For a
finite-size network, we will assume that a deterministic fraction
$\beta$ of the sensors are Byzantine. In that case, all the sensors can be
polled when verifying a transmission. Thus if less than half of
the sensors are Byzantine, information will always be correctly
verified. This requires a constant and hence asymptotically
negligible number of channel uses, so polling every sensor
instead of a random subset does not affect the rate. We can also
relax the assumption that the consensus is perfect by assuming
that a fraction of the sensors are misinformed, as in
[3]. In such a circumstance, a coding algorithm similar to the
one described in this paper can be used, and the full channel
capacity can be achieved as long as the correctly informed
honest sensors outnumber the Byzantine sensors, though the
proof of this is nontrivial.